Differential expression analysis for RNAseq using Poisson mixed models

Authors

  • Shiquan Sun
  • Michelle Hood
  • Laura Scott
  • Qinke Peng
  • Sayan Mukherjee
  • Jenny Tung
  • Xiang Zhou
Abstract

Identifying differentially expressed (DE) genes from RNA sequencing (RNAseq) studies is among the most common analyses in genomics. However, RNAseq DE analysis presents several statistical and computational challenges, including over-dispersed read counts and, in some settings, sample non-independence. Previous count-based methods rely on simple hierarchical Poisson models (e.g. negative binomial) to model independent over-dispersion, but do not account for sample non-independence due to relatedness, population structure and/or hidden confounders. Here, we present a Poisson mixed model with two random effects terms that account for both independent over-dispersion and sample non-independence. We also develop a scalable sampling-based inference algorithm using a latent variable representation of the Poisson distribution. With simulations, we show that our method properly controls type I error and is generally more powerful than other widely used approaches, except in small samples (n < 15) with other unfavorable properties (e.g. small effect sizes). We also apply our method to three real datasets that contain related individuals, population stratification or hidden confounders. Our results show that our method increases power in all three datasets compared to other approaches, though the power gain is smallest in the smallest sample (n = 6). Our method is implemented in MACAU, freely available at www.xzlab.org/software.html.
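To make the model structure concrete, the following is a minimal simulation sketch of read counts for one gene under a Poisson model with two random effects: one correlated across samples and one independent per sample. It is not the MACAU implementation and does not perform inference. The log link, the parameter names (`h2`, `sigma2`, `beta`), the block-diagonal relatedness matrix `K`, and the read-depth range are all illustrative assumptions, not details stated in the abstract.

```python
import numpy as np

def simulate_pmm_counts(n=200, h2=0.3, sigma2=0.5, beta=0.5, seed=0):
    """Simulate counts for one gene from a Poisson model with two random effects.

    All names and default values here are hypothetical, chosen for illustration.
    """
    rng = np.random.default_rng(seed)

    # Assumed relatedness matrix K: families of 4 samples, with 1.0 on the
    # diagonal and 0.5 between members of the same family.
    fam = np.repeat(np.arange(n // 4), 4)
    K = 0.5 * (fam[:, None] == fam[None, :]) + 0.5 * np.eye(n)

    x = rng.integers(0, 2, size=n).astype(float)     # binary predictor of interest
    N = rng.integers(1_000_000, 2_000_000, size=n)   # assumed total read depth per sample

    # Random effect 1: correlated across samples (sample non-independence).
    g = rng.multivariate_normal(np.zeros(n), sigma2 * h2 * K)
    # Random effect 2: independent per sample (independent over-dispersion).
    e = rng.normal(0.0, np.sqrt(sigma2 * (1.0 - h2)), size=n)

    # Assumed log link: log(rate) = intercept + beta * x + g + e.
    log_rate = np.log(1e-5) + beta * x + g + e
    y = rng.poisson(N * np.exp(log_rate))
    return y, x, N, K

if __name__ == "__main__":
    y, x, N, K = simulate_pmm_counts()
    # A variance-to-mean ratio above 1 indicates over-dispersion relative to Poisson.
    print("variance / mean of counts:", y.var() / y.mean())
```

Setting `h2` close to 0 in this sketch leaves only the independent random effect, which corresponds to the independent over-dispersion that the abstract attributes to earlier count-based methods; larger values shift more of the variance into the correlated term.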

Similar resources

MultiRankSeq: Multiperspective Approach for RNAseq Differential Expression Analysis and Quality Control

BACKGROUND After a decade of microarray technology dominating the field of high-throughput gene expression profiling, the introduction of RNAseq has revolutionized gene expression research. While RNAseq provides more abundant information than microarray, its analysis has proved considerably more complicated. To date, no consensus has been reached on the best approach for RNAseq-based differenti...

RNAseqPS: A Web Tool for Estimating Sample Size and Power for RNAseq Experiment

Sample size and power determination is the first step in the experimental design of a successful study. Sample size and power calculation is required for applications for National Institutes of Health (NIH) funding. Sample size and power calculation is well established for traditional biological studies such as mouse model, genome wide association study (GWAS), and microarray studies. Recent de...

Factors Affecting Hospital Length of Stay Using Mixed Poisson Regression Models

Background and purpose: Modeling of Hospital Length of Stay (LOS) is of great importance in healthcare systems, but there is a paucity of information on this issue in Iran. The aim of this study was to identify the optimal model among different mixed Poisson distributions in modeling the LOS and effective factors. Materials and methods: In this cross-sectional study, we studied 1256 records, inc...

RNAontheBENCH: computational and empirical resources for benchmarking RNAseq quantification and differential expression methods

RNA sequencing (RNAseq) has become the method of choice for transcriptome analysis, yet no consensus exists as to the most appropriate pipeline for its analysis, with current benchmarks suffering important limitations. Here, we address these challenges through a rich benchmarking resource harnessing (i) two RNAseq datasets including ERCC ExFold spike-ins; (ii) Nanostring measurements of a panel...

ASpli: An integrative R package for analysing alternative splicing using RNAseq

3 Running ASpli; 3.1 To BIN or not to BIN; 3.2 Read Counting; 3.2.1 Mapping file and bam loading; 3.2.2 Overlap features and read ali...


Volume: 45

Publication date: 2017